Abstract

A method to facilitate the consistent inclusion of cross-section measurements based on 
complex final-states from HERA, TEVATRON and the LHC in proton parton density func- 
tion (PDF) fits has been developed. This can be used to increase the sensitivity of LHC data 
to deviations from Standard Model predictions. The method stores perturbative coefficients 
of NLO QCD calculations of final-state observables measured in hadron colliders in look-up 
tables. This allows the a posteriori inclusion of parton density functions (PDFs), and of the 
strong coupling, as well as the a posteriori variation of the renormalisation and factorisation 
scales in cross-section calculations. The main novelties in comparison to original work on the 
subject are the use of higher-order interpolation, which substantially improves the trade-off 
between accuracy and memory use, and a CPU and computer memory optimised way to 
construct and store the look-up table using modern software tools. It is demonstrated that 
a sufficient accuracy on the cross-section calculation can be achieved with reasonably small 
look-up table size by using the examples of jet production and electro-weak boson (Z, W) 
production in proton-proton collisions at a centre-of-mass energy of 14 TeV at the LHC.
The use of this technique in PDF fitting is demonstrated in a PDF-fit to HERA data and 
simulated LHC jet cross-sections as well as in a study of the jet cross-section uncertainties 
at various centre-of-mass energies. 
1 Introduction



The Large Hadron Collider (LHC) at CERN will collide protons at a centre-of-mass energy of up 
to 14000 GeV. The combination of its high collision rate and centre-of-mass energy will make it 
possible to probe new interactions at very short distances. Such interactions might be revealed 
in the production cross-sections of particles at very high transverse momentum ($p_T$) as a
deviation from the Standard Model theory.

The sensitivity to new physics depends on experimental uncertainties in the measurements 
and on theoretical uncertainties in the Standard Model predictions. It is therefore important 
to work out a strategy to minimise both the experimental and theoretical uncertainties from 
LHC data. Residual renormalisation and factorisation scale uncertainties in next-to-leading 
order (NLO) QCD calculations for single inclusive jet cross-sections are typically about 5-10%
and should hopefully be reduced as NNLO calculations become available. However, in some
kinematic regimes, PDF uncertainties can be substantially larger than the uncertainties from
higher-order corrections, for example at large $p_T$. One strategy to reduce such uncertainty is to
use single inclusive jet or Drell-Yan cross-sections at lower $p_T$ to constrain the proton parton
density function (PDF) uncertainties at high $p_T$.

In order to further constrain PDF uncertainties, it would be useful to be able to include
final-state data such as $p_T$ and rapidity distributions for W/Z-boson and jet production in
global NLO QCD PDF fits, without recourse to inexact methods such as correcting LO
cross-sections by simple factors (K-factors). We propose here a method for a consistent
inclusion of final-state observables in global QCD analyses.

For inclusive data, like the proton structure function in deep-inelastic scattering (DIS) the 
perturbative coefficients are known analytically. During the fit the cross-section can therefore 
be quickly calculated from the strong coupling ($\alpha_s$) and the PDFs and then be compared to the
measurements. However, final state observables, where detector acceptances or jet algorithms 
are involved in the definition of the perturbative coefficients (called "weights" in the following), 
have to be calculated using NLO QCD Monte Carlo programs. Typically such programs need 
about one day of CPU time to accurately calculate the cross-section. It is therefore necessary 
to find less time consuming methods. 

Any NLO QCD calculation of a final-state observable involves Monte Carlo integration over 
a large number of events. For deep-inelastic scattering and at hadron colliders this must usually 
be repeated for each new PDF set, making it impractical to consider many 'error' PDF sets, or 
carry out PDF fits. Here, the "a posteriori" inclusion of PDFs is discussed, whereby the Monte
Carlo run calculates a look-up table (in momentum fraction, x, and momentum transfer, Q) of 
cross-section weights that can subsequently be combined with an arbitrary PDF. The procedure 
is numerically equivalent to using an interpolated form of the PDF. 

Many methods have been proposed to solve this problem in the past [1-5]. In principle the
highest efficiencies can be obtained by taking moments with respect to Bjorken-x [1,2], because 
this converts convolutions into multiplications. This can have notable advantages with respect 
to memory consumption, especially in cases with two incoming hadrons. On the other hand, 
there are complications such as the need for PDFs in moment space and the associated inverse 
Mellin transforms. 

Methods in x-space have traditionally been somewhat less efficient, both in terms of speed 
and in terms of memory consumption. They are, however, somewhat more transparent since
they provide direct information on the x values of relevance. Furthermore they can be used with 
any PDF. The use of x-space methods can be further improved by using methods developed 
originally for PDF evolution [6-8]. 

Our method [9] bears a number of similarities to that of the fastNLO project [10] and the
two approaches were to some extent developed in parallel. Relative to fastNLO, we take better
advantage of the sparse nature of the x-dependent weights, allow for more flexibility in the scale
choice by keeping explicitly the scale dependence as an additional dimension in the weighting
table and provide a means to evaluate renormalisation and factorisation scale-dependence a
posteriori. We also provide a broader range of processes, since in addition to di-jet production,
we include W- and Z-boson production. In order to make easy use of the large number of weight
files for practically all inclusive jet $p_T$ spectra and di-jet mass spectra made available by the
fastNLO project, we provide a software interface to make use of these weight tables within the
APPLGRID framework.



2 PDF-independent representation of cross-sections 
2.1 Representing the PDF on a grid 

We make the assumption that PDFs can be accurately represented by storing their values on a
two-dimensional grid of points and using $n$th-order interpolations between those points. Instead
of using the parton momentum fraction $x$ and the factorisation scale $Q^2$, we use a variable
transformation that provides good coverage of the full $x$ and $Q^2$ range with uniformly spaced
grid points:

\[
y(x) = \ln\frac{1}{x} + a(1-x) \quad\text{and}\quad \tau(Q^2) = \ln\ln\frac{Q^2}{\Lambda^2}. \qquad (1)
\]

The parameter $\Lambda$ should be chosen of the order of $\Lambda_{QCD}$, but need not necessarily be identical.
The parameter $a$ serves to increase the density of points in the large $x$ region (footnote 1) and can be
chosen according to the needs of the concrete application (footnote 2).

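The PDF $f(x,Q^2)$ is then represented by its values $f_{i_y,i_\tau}$ at the 2-dimensional grid point
$(i_y\,\delta y,\, i_\tau\,\delta\tau)$, where $\delta y$ and $\delta\tau$ denote the grid spacings, and is obtained elsewhere by
interpolation:

\[
f(x,Q^2) = \sum_{i=0}^{n}\sum_{\iota=0}^{n'} f_{k+i,\kappa+\iota}\,
I_i^{(n)}\!\left(\frac{y(x)}{\delta y}-k\right) I_\iota^{(n')}\!\left(\frac{\tau(Q^2)}{\delta\tau}-\kappa\right), \qquad (2)
\]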
where $n$, $n'$ are the interpolation orders. The interpolation function $I_i^{(n)}(u)$ is 1 for $u = i$, and
otherwise is given by:

\[
I_i^{(n)}(u) = \frac{(-1)^{n-i}}{i!\,(n-i)!}\,\frac{u(u-1)\cdots(u-n)}{u-i}. \qquad (3)
\]

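Defining int($u$) to be the largest integer such that int($u$) $\le u$, $k$ and $\kappa$ are defined as:

\[
k(x) = \mathrm{int}\!\left(\frac{y(x)}{\delta y} - \frac{n-1}{2}\right), \qquad
\kappa(Q^2) = \mathrm{int}\!\left(\frac{\tau(Q^2)}{\delta\tau} - \frac{n'-1}{2}\right). \qquad (4)
\]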



1 For a fixed total number of bins, as the bins at large $x$ get finer, the low-$x$ ones become wider.
2 In the case $a = 0$ the function is analytically invertible; for $a \neq 0$ a numerical inversion has to be applied.



Given finite grids whose vertex indices range from $0 \ldots N_y - 1$ for the $y$ grid and $0 \ldots N_\tau - 1$ for
the $\tau$ grid, one should additionally require that eq. (2) only uses available grid points. This can
be achieved by remapping $k \to \max(0, \min(N_y - 1 - n, k))$ and $\kappa \to \max(0, \min(N_\tau - 1 - n', \kappa))$.
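The representation described in this section can be sketched in a few lines of Python. This is an illustration only, not the APPLGRID code: the class name `PdfGrid`, the default values (Lambda = 0.25 GeV, cubic interpolation), the choice to start the tau grid at an offset `t0`, and the bisection used for the numerical inversion of y(x) are all assumptions made for the example.

```python
import math

def y_of_x(x, a=5.0):
    """Eq. (1): y(x) = ln(1/x) + a (1 - x)."""
    return math.log(1.0 / x) + a * (1.0 - x)

def x_of_y(y, a=5.0):
    """Numerical inverse of y(x): bisection in ln x (y falls monotonically with x)."""
    lo, hi = -50.0, 0.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if y_of_x(math.exp(mid), a) > y:
            lo = mid          # trial x is too small
        else:
            hi = mid
    return math.exp(0.5 * (lo + hi))

def tau_of_q2(q2, lam=0.25):
    """Eq. (1): tau(Q^2) = ln ln(Q^2 / Lambda^2), with Lambda in GeV."""
    return math.log(math.log(q2 / lam**2))

def q2_of_tau(tau, lam=0.25):
    return lam**2 * math.exp(math.exp(tau))

def lagrange(i, n, u):
    """Eq. (3) in product form: equals 1 at u = i and 0 at the other nodes 0..n."""
    out = 1.0
    for j in range(n + 1):
        if j != i:
            out *= (u - j) / (i - j)
    return out

def first_node(u, n, npoints):
    """Eq. (4), followed by the remapping that keeps k..k+n inside the grid."""
    k = math.floor(u - (n - 1) / 2.0)
    return max(0, min(npoints - 1 - n, k))

class PdfGrid:
    """Stores pdf(x, Q^2) on a uniform (y, tau) grid and interpolates with eq. (2)."""
    def __init__(self, pdf, xmin, q2min, q2max, ny=30, nt=10, n=3, a=5.0, lam=0.25):
        self.n, self.a, self.lam = n, a, lam
        self.dy = y_of_x(xmin, a) / (ny - 1)      # y runs from 0 (x = 1) upwards
        self.t0 = tau_of_q2(q2min, lam)           # assumption: tau grid starts at q2min
        self.dt = (tau_of_q2(q2max, lam) - self.t0) / (nt - 1)
        self.values = [[pdf(x_of_y(iy * self.dy, a),
                            q2_of_tau(self.t0 + it * self.dt, lam))
                        for it in range(nt)] for iy in range(ny)]

    def __call__(self, x, q2):
        n = self.n
        uy = y_of_x(x, self.a) / self.dy
        ut = (tau_of_q2(q2, self.lam) - self.t0) / self.dt
        k = first_node(uy, n, len(self.values))
        kap = first_node(ut, n, len(self.values[0]))
        return sum(self.values[k + i][kap + j]
                   * lagrange(i, n, uy - k) * lagrange(j, n, ut - kap)
                   for i in range(n + 1) for j in range(n + 1))
```

For a smooth toy function such as `x**-0.5 * (1 - x)**3 * log(Q2)`, a grid of 30 x 10 points built this way should reproduce the function between the nodes to well below the per-cent level; the achievable accuracy for realistic PDFs is the subject of section 4.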



2.2 Representing the final state cross-section weights on a grid (DIS case) 

To illustrate the method we take the case of a single flavour in deep-inelastic scattering (DIS). 

Suppose that we have an NLO Monte Carlo program that produces events, $m = 1 \ldots N$.
Each event $m$ has an $x$ value, $x_m$, a $Q^2$ value, $Q_m^2$, as well as a weight, $w_m$. We define $p_m$ as the
number of powers of the strong coupling $\alpha_s$ in event $m$. Normally one would obtain the final
result $W$ of the Monte Carlo integration for one sub-process from (footnote 3)

\[
W = \sum_{m=1}^{N} w_m \left(\frac{\alpha_s(Q_m^2)}{2\pi}\right)^{p_m} f(x_m, Q_m^2), \qquad (5)
\]

where $f(x,Q^2)$ is the PDF of the flavour under consideration.



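Instead one introduces a weight grid $W^{(p)}_{i_y,i_\tau}$ and then for each event one updates a portion
of the grid with, for $i = 0 \ldots n$ and $\iota = 0 \ldots n'$:

\[
W^{(p_m)}_{k+i,\kappa+\iota} \to W^{(p_m)}_{k+i,\kappa+\iota}
+ w_m\, I_i^{(n)}\!\left(\frac{y(x_m)}{\delta y}-k\right) I_\iota^{(n')}\!\left(\frac{\tau(Q_m^2)}{\delta\tau}-\kappa\right), \qquad (6)
\]

where $k \equiv k(x_m)$, $\kappa \equiv \kappa(Q_m^2)$.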

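The final result for $W$, for an arbitrary PDF and an arbitrary $\alpha_s$, can then be obtained subsequent
to the Monte Carlo run:

\[
W = \sum_{p}\sum_{i_y}\sum_{i_\tau} W^{(p)}_{i_y,i_\tau}
\left(\frac{\alpha_s\!\left({Q^2}^{(i_\tau)}\right)}{2\pi}\right)^{p} f\!\left(x^{(i_y)}, {Q^2}^{(i_\tau)}\right), \qquad (7)
\]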
where the sums with indices $i_y$ and $i_\tau$ run over the number of grid points and we have explicitly
introduced $x^{(i_y)}$ and ${Q^2}^{(i_\tau)}$ such that:

\[
y\!\left(x^{(i_y)}\right) = i_y\,\delta y \quad\text{and}\quad \tau\!\left({Q^2}^{(i_\tau)}\right) = i_\tau\,\delta\tau. \qquad (8)
\]
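The filling step and the a posteriori convolution for the single-flavour DIS case can be sketched as follows. This is a toy illustration under stated assumptions, not the APPLGRID implementation: the class `WeightGrid`, the module-level constants `A`, `LAM` and `N`, the offset of the tau grid and the dense list-of-lists storage are all choices made for the example.

```python
import math
import random

A, LAM, N = 5.0, 0.25, 3   # assumed: transform parameter a, Lambda (GeV), interpolation order

def y_of_x(x):
    return math.log(1.0 / x) + A * (1.0 - x)

def x_of_y(y):
    lo, hi = -50.0, 0.0                       # bisection in ln x
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if y_of_x(math.exp(mid)) > y else (lo, mid)
    return math.exp(0.5 * (lo + hi))

def tau_of_q2(q2):
    return math.log(math.log(q2 / LAM**2))

def q2_of_tau(tau):
    return LAM**2 * math.exp(math.exp(tau))

def lagrange(i, u):
    out = 1.0
    for j in range(N + 1):
        if j != i:
            out *= (u - j) / (i - j)
    return out

def first_node(u, npoints):
    return max(0, min(npoints - 1 - N, math.floor(u - (N - 1) / 2.0)))

class WeightGrid:
    """One weight table per power p of alpha_s, on a uniform (y, tau) grid."""
    def __init__(self, xmin, q2min, q2max, ny=30, nt=10, orders=(1, 2)):
        self.ny, self.nt = ny, nt
        self.dy = y_of_x(xmin) / (ny - 1)
        self.t0 = tau_of_q2(q2min)
        self.dt = (tau_of_q2(q2max) - self.t0) / (nt - 1)
        self.x = [x_of_y(i * self.dy) for i in range(ny)]
        self.q2 = [q2_of_tau(self.t0 + i * self.dt) for i in range(nt)]
        self.w = {p: [[0.0] * nt for _ in range(ny)] for p in orders}

    def fill(self, x, q2, weight, p):
        """Eq. (6): spread one event weight over (N+1) x (N+1) neighbouring nodes."""
        uy = y_of_x(x) / self.dy
        ut = (tau_of_q2(q2) - self.t0) / self.dt
        k, kap = first_node(uy, self.ny), first_node(ut, self.nt)
        for i in range(N + 1):
            for j in range(N + 1):
                self.w[p][k + i][kap + j] += (weight * lagrange(i, uy - k)
                                              * lagrange(j, ut - kap))

    def convolute(self, pdf, alphas):
        """Eq. (7): combine the stored weights with any PDF and alpha_s function."""
        total = 0.0
        for p, grid in self.w.items():
            for iy in range(self.ny):
                for it in range(self.nt):
                    if grid[iy][it] != 0.0:
                        total += (grid[iy][it]
                                  * (alphas(self.q2[it]) / (2.0 * math.pi))**p
                                  * pdf(self.x[iy], self.q2[it]))
        return total
```

A quick consistency check is to accumulate the direct sum of eq. (5) alongside the calls to `fill` (for example with events drawn using `random.uniform`) and to compare it with `convolute` for the same PDF; the two should agree up to the interpolation error, and `convolute` can then be repeated with other PDFs without regenerating events.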



2.3 Including renormalisation and factorisation scale dependence 

If one has the weight matrix $W^{(p)}_{i_y,i_\tau}$ determined separately order by order in $\alpha_s$, it is
straightforward to vary the renormalisation $\mu_R$ and factorisation $\mu_F$ scales a posteriori (we assume that
they were set equal in the original calculation).

It is helpful to introduce some notation related to the DGLAP evolution equation:

\[
\frac{d f(x,Q^2)}{d\ln Q^2} = \frac{\alpha_s(Q^2)}{2\pi}\,(P_0 \otimes f)(x,Q^2)
+ \left(\frac{\alpha_s(Q^2)}{2\pi}\right)^{2} (P_1 \otimes f)(x,Q^2) + \ldots, \qquad (9)
\]



3 Here, and in the following, renormalisation and factorisation scales have been set equal for simplicity. 



where $P_0$ and $P_1$ are the LO and NLO matrices of DGLAP splitting functions that operate
on vectors (in flavour space) $f$ of the PDFs. Let us now restrict our attention to the NLO case
where we have just two values of $p$ in eq. (7). For example, in jet production in DIS, $p_{LO} = 1$ and
$p_{NLO} = 2$. Introducing $\xi_R$ and $\xi_F$ corresponding to the factors by which one varies $\mu_R$ and $\mu_F$
respectively, for arbitrary $\xi_R$ and $\xi_F$ we may then write:




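\[
\begin{aligned}
W(\xi_R,\xi_F) = \sum_{i_y}\sum_{i_\tau} \Bigg\{ &
\left(\frac{\alpha_s\!\left(\xi_R^2 {Q^2}^{(i_\tau)}\right)}{2\pi}\right)^{p_{LO}}
W^{(p_{LO})}_{i_y,i_\tau}\, f\!\left(x^{(i_y)}, \xi_F^2 {Q^2}^{(i_\tau)}\right) \\
& + \left(\frac{\alpha_s\!\left(\xi_R^2 {Q^2}^{(i_\tau)}\right)}{2\pi}\right)^{p_{NLO}}
\Big[ \left( W^{(p_{NLO})}_{i_y,i_\tau} + 2\pi\beta_0\, p_{LO} \ln\xi_R^2\; W^{(p_{LO})}_{i_y,i_\tau} \right)
f\!\left(x^{(i_y)}, \xi_F^2 {Q^2}^{(i_\tau)}\right) \\
& \qquad - \ln\xi_F^2\; W^{(p_{LO})}_{i_y,i_\tau}\, (P_0 \otimes f)\!\left(x^{(i_y)}, \xi_F^2 {Q^2}^{(i_\tau)}\right) \Big] \Bigg\}, \qquad (10)
\end{aligned}
\]

where $\beta_0 = (11N_c - 2n_f)/(12\pi)$ and $N_c$ ($n_f$) is the number of colours (flavours). Though this
formula is given for an x-space based approach, a similar formula applies for moment-space
approaches. Furthermore it is straightforward to extend it to higher perturbative orders.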

To obtain the full DIS cross-section a summation of the weights and the parton densities 
over the contributing sub-processes is required. 
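The renormalisation-scale part of this procedure can be illustrated numerically for a single grid node. The sketch below is an assumption-laden toy: it keeps the factorisation scale fixed ($\xi_F = 1$), uses a one-loop running coupling with an assumed $\Lambda = 0.25$ GeV and $n_f = 5$, and the function names are invented for the example. It shows that adding the term $2\pi\beta_0\, p_{LO} \ln\xi_R^2\, W^{(p_{LO})}$ to the NLO weight compensates the leading change of $\alpha_s$ when the scale is varied.

```python
import math

NC, NF = 3, 5                                   # assumed numbers of colours and flavours
BETA0 = (11 * NC - 2 * NF) / (12 * math.pi)     # beta_0 as defined in the text

def alphas(q2, lam=0.25):
    """One-loop running coupling; an assumption made for this sketch."""
    return 1.0 / (BETA0 * math.log(q2 / lam**2))

def nlo_with_scale(w_lo, w_nlo, q2, xi_r, p_lo=1):
    """LO + NLO contribution of one node with the compensating log term (xi_F = 1)."""
    a = alphas(xi_r**2 * q2) / (2.0 * math.pi)
    shift = 2.0 * math.pi * BETA0 * p_lo * math.log(xi_r**2) * w_lo
    return a**p_lo * w_lo + a**(p_lo + 1) * (w_nlo + shift)

def nlo_naive(w_lo, w_nlo, q2, xi_r, p_lo=1):
    """Same, but only alpha_s is re-evaluated at the new scale (no compensation)."""
    a = alphas(xi_r**2 * q2) / (2.0 * math.pi)
    return a**p_lo * w_lo + a**(p_lo + 1) * w_nlo
```

With toy weights such as `w_lo = 1`, `w_nlo = 5` at `q2 = 1e4`, the change between `xi_r = 1` and `xi_r = 2` is noticeably smaller for `nlo_with_scale` than for `nlo_naive`, as expected for a result whose residual scale dependence is of higher order.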



2.4 The case of two incoming hadrons 

In hadron-hadron scattering one can use analogous procedures but with one more dimension. 
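Besides $Q^2$, the weight grid depends on the momentum fractions of the first ($x_1$) and second
($x_2$) hadrons.

The analogue of eq. (7) is given by:

\[
W = \sum_{p}\sum_{l=0}^{n_{sub}-1}\sum_{i_{y_1}}\sum_{i_{y_2}}\sum_{i_\tau}
W^{(p)(l)}_{i_{y_1},i_{y_2},i_\tau}
\left(\frac{\alpha_s\!\left({Q^2}^{(i_\tau)}\right)}{2\pi}\right)^{p}
F^{(l)}\!\left(x_1^{(i_{y_1})}, x_2^{(i_{y_2})}, {Q^2}^{(i_\tau)}\right), \qquad (11)
\]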

where $n_{sub}$ is the number of sub-processes and the initial-state parton combinations $F^{(l)}$ are
specified for each process separately (see eqs. (12) and (20)).

The combinations of the incoming parton densities (defining the number of sub-processes)
can often be simplified by making use of the symmetries in the weights. In the case of jet
production only seven sub-processes are needed (see section 2.4.1). The case of W-boson and
Z-boson production is treated in Appendix A. The case of b-quark production is discussed
in ref. [11].

An automated way to find the sub-processes is discussed in Appendix B. 
2.4.1 Sub-processes for jet production in hadron-hadron collisions



In the case of jet production in proton-proton collisions the weights generated by the Monte 
Carlo program can be organised in seven possible initial-state combinations of partons: 
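\[
\begin{aligned}
gg:&\quad F^{(0)}(x_1,x_2;Q^2) = G_1(x_1)\,G_2(x_2), \\
qg:&\quad F^{(1)}(x_1,x_2;Q^2) = \left(Q_1(x_1) + \bar{Q}_1(x_1)\right) G_2(x_2), \\
gq:&\quad F^{(2)}(x_1,x_2;Q^2) = G_1(x_1) \left(Q_2(x_2) + \bar{Q}_2(x_2)\right), \\
qr:&\quad F^{(3)}(x_1,x_2;Q^2) = Q_1(x_1)\,Q_2(x_2) + \bar{Q}_1(x_1)\,\bar{Q}_2(x_2) - D(x_1,x_2), \\
qq:&\quad F^{(4)}(x_1,x_2;Q^2) = D(x_1,x_2), \\
q\bar{q}:&\quad F^{(5)}(x_1,x_2;Q^2) = \bar{D}(x_1,x_2), \\
q\bar{r}:&\quad F^{(6)}(x_1,x_2;Q^2) = Q_1(x_1)\,\bar{Q}_2(x_2) + \bar{Q}_1(x_1)\,Q_2(x_2) - \bar{D}(x_1,x_2), \qquad (12)
\end{aligned}
\]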

where $g$ denotes gluons, $q$ quarks and $r$ quarks of different flavour, $q \neq r$, and we have used the
generalised PDFs defined as:

\[
\begin{aligned}
G_H(x) &= f_{0/H}(x,Q^2), \qquad
Q_H(x) = \sum_{i=1}^{6} f_{i/H}(x,Q^2), \qquad
\bar{Q}_H(x) = \sum_{i=-6}^{-1} f_{i/H}(x,Q^2), \\
D(x_1,x_2) &= \sum_{i=-6,\; i\neq 0}^{6} f_{i/H_1}(x_1,Q^2)\, f_{i/H_2}(x_2,Q^2), \\
\bar{D}(x_1,x_2) &= \sum_{i=-6,\; i\neq 0}^{6} f_{i/H_1}(x_1,Q^2)\, f_{-i/H_2}(x_2,Q^2), \qquad (13)
\end{aligned}
\]

where $f_{i/H}$ is the PDF of flavour $i = -6 \ldots 6$ for hadron $H$ and $H_1$ ($H_2$) denotes the first or
second hadron (footnote 4).
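The construction of the seven combinations from the 13 parton densities of each hadron can be sketched as below. The ordering gg, qg, gq, qr, qq, q-qbar, q-rbar and the function name are assumptions of this example; the generalised PDFs G, Q, Qbar, D and Dbar follow the definitions given in the text, with flavours stored in a list indexed by i + 6.

```python
def jet_subprocesses(f1, f2):
    """Return the seven initial-state combinations for jet production.

    f1, f2: lists of 13 PDF values for hadron 1 and hadron 2, where entry
    i + 6 holds flavour i = -6..6 (0 = gluon, negative = anti-quarks).
    """
    def at(f, i):
        return f[i + 6]

    g1, g2 = at(f1, 0), at(f2, 0)
    q1 = sum(at(f1, i) for i in range(1, 7))        # quarks of hadron 1
    q2 = sum(at(f2, i) for i in range(1, 7))
    qb1 = sum(at(f1, i) for i in range(-6, 0))      # anti-quarks of hadron 1
    qb2 = sum(at(f2, i) for i in range(-6, 0))
    flavours = [i for i in range(-6, 7) if i != 0]
    d = sum(at(f1, i) * at(f2, i) for i in flavours)     # same flavour
    db = sum(at(f1, i) * at(f2, -i) for i in flavours)   # flavour with its anti-flavour

    return [
        g1 * g2,                      # gg
        (q1 + qb1) * g2,              # qg
        g1 * (q2 + qb2),              # gq
        q1 * q2 + qb1 * qb2 - d,      # qr
        d,                            # qq
        db,                           # q qbar
        q1 * qb2 + qb1 * q2 - db,     # q rbar
    ]
```

A useful sanity check of such a decomposition is that the seven combinations add up to the product of the summed parton densities of the two hadrons, i.e. no parton pair is dropped or counted twice.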



2.5 Including scale dependence in the case of two incoming hadrons 

It is again possible to choose arbitrary renormalisation and factorisation scales. Specifically for 
NLO accuracy: 



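\[
\begin{aligned}
W(\xi_R,\xi_F) = \sum_{l=0}^{n_{sub}-1}\sum_{i_{y_1}}\sum_{i_{y_2}}\sum_{i_\tau} \Bigg\{ &
\left(\frac{\alpha_s\!\left(\xi_R^2 {Q^2}^{(i_\tau)}\right)}{2\pi}\right)^{p_{LO}}
W^{(p_{LO})(l)}_{i_{y_1},i_{y_2},i_\tau}\,
F^{(l)}\!\left(x_1^{(i_{y_1})}, x_2^{(i_{y_2})}, \xi_F^2 {Q^2}^{(i_\tau)}\right) \\
& + \left(\frac{\alpha_s\!\left(\xi_R^2 {Q^2}^{(i_\tau)}\right)}{2\pi}\right)^{p_{NLO}}
\Big[ \left( W^{(p_{NLO})(l)}_{i_{y_1},i_{y_2},i_\tau} + 2\pi\beta_0\, p_{LO} \ln\xi_R^2\; W^{(p_{LO})(l)}_{i_{y_1},i_{y_2},i_\tau} \right)
F^{(l)}\!\left(x_1^{(i_{y_1})}, x_2^{(i_{y_2})}, \xi_F^2 {Q^2}^{(i_\tau)}\right) \\
& \qquad - \ln\xi_F^2\; W^{(p_{LO})(l)}_{i_{y_1},i_{y_2},i_\tau}
\Big( F^{(l)}_{q_1 \to P_0 \otimes q_1}\!\left(x_1^{(i_{y_1})}, x_2^{(i_{y_2})}, \xi_F^2 {Q^2}^{(i_\tau)}\right)
+ F^{(l)}_{q_2 \to P_0 \otimes q_2}\!\left(x_1^{(i_{y_1})}, x_2^{(i_{y_2})}, \xi_F^2 {Q^2}^{(i_\tau)}\right) \Big) \Big] \Bigg\}, \qquad (14)
\end{aligned}
\]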



4 In the above equation and in the following we follow the standard PDG Monte Carlo numbering scheme [12], 
where gluons are denoted as 0, quarks have values from 1-6 and anti-quarks have the corresponding negative 
values. 
where $F^{(l)}_{q_1 \to P_0 \otimes q_1}$ is calculated as $F^{(l)}$, but with $q_1$ replaced with $P_0 \otimes q_1$, and analogously for
$F^{(l)}_{q_2 \to P_0 \otimes q_2}$.



2.6 Reweighting to a different centre-of-mass energy

From a weight grid $W$ calculated at a particular centre-of-mass energy $\sqrt{s}$ it is also possible to
calculate a cross-section at a different centre-of-mass energy $\sqrt{s'}$ by using transformed parton
momentum fractions and adding a flux factor in the cross-section convolution as given by
eq. (11) or eq. (14):

\[
W'(\xi_R,\xi_F) = \frac{s}{s'}\, W(\xi_R,\xi_F), \qquad (15)
\]

and the momentum fractions $x_{1,2}$ in the generalised parton densities $F(x_1,x_2,Q^2)$ are replaced
by:

\[
x'_{1,2} = \frac{\sqrt{s}}{\sqrt{s'}}\; x_{1,2}. \qquad (16)
\]

When $\sqrt{s'} < \sqrt{s}$ it can occur that $x'_1 > 1$ or $x'_2 > 1$, in which case the parton densities should be
set to zero. One should be aware that a jet transverse momentum that corresponds to moderate
$x$ values with centre-of-mass energy $\sqrt{s}$ (and correspondingly low density of grid points in $x$)
may correspond to large $x$ when using a smaller $\sqrt{s'}$. In such cases, it can happen that the low
density of grid points in $x$ is no longer sufficient, given that PDFs vary more rapidly at large $x$
than at moderate $x$.
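The rescaling of the momentum fractions can be sketched as follows. The function names are invented for this example, and the flux factor is taken here to be $s/s'$, the Jacobian of the change of variables from $(x_1, x_2)$ to $(x'_1, x'_2)$; this form is an assumption of the sketch rather than a statement about the actual code.

```python
def rescale_x(x1, x2, sqrt_s, sqrt_s_new):
    """Return (x1', x2', flux) for a grid produced at sqrt_s and used at sqrt_s_new."""
    ratio = sqrt_s / sqrt_s_new
    return x1 * ratio, x2 * ratio, ratio**2

def reweighted_luminosity(pdf1, pdf2, x1, x2, q2, sqrt_s, sqrt_s_new):
    """Parton luminosity at one grid node after rescaling.

    pdf1, pdf2 are callables f(x, Q^2); the densities are set to zero when a
    rescaled momentum fraction exceeds 1.
    """
    x1p, x2p, flux = rescale_x(x1, x2, sqrt_s, sqrt_s_new)
    if x1p > 1.0 or x2p > 1.0:
        return 0.0
    return flux * pdf1(x1p, q2) * pdf2(x2p, q2)
```

For instance, going from 14 TeV to 7 TeV doubles both momentum fractions, so any node with $x > 0.5$ in the original grid no longer contributes.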



Special care is also needed when taking $\sqrt{s'} > \sqrt{s}$ insofar as there will be kinematic regions
accessible with the larger $\sqrt{s'}$ values that were not probed at all in the original NLO calculation
at centre-of-mass energy $\sqrt{s}$. As a concrete example, with $\sqrt{s'} = 14$ TeV, there can be events
with three jets having respectively $p_T = 6, 4, 2$ TeV. Such events contribute to the inclusive jet
spectrum at $p_T = 4$ TeV. However, taking a grid calculated with $\sqrt{s} = 10$ TeV (where such
events are kinematically disallowed) and using it to determine the inclusive jet spectrum with
$\sqrt{s'} = 14$ TeV will miss these contributions.


3 Technical implementation 

To test the scheme discussed above, the NLO QCD Monte Carlo programs NLOJET++ [13] for
jet production and MCFM [14, 15] for the production of W- and Z-bosons are used. To illustrate
the performance of the method, jet and W- and Z-boson production are used as examples.
However, it is worth noting that these two programs give access to many of the NLO QCD
calculations presently available.

The weight grid $W^{(p)(l)}_{i_{y_1},i_{y_2},i_\tau}$ of eq. (11) is filled (for each cross-section bin) in the user module of
the NLO program, where one has access to the event weights and the partons' momenta. This 
object is called "grid" in the following. At this point the cross-section definition is specified and 
the physical observables that are being studied are defined (e.g. using a jet algorithm). 

The weight grid for each value of the observable in question is represented as a multidimensional
object with one dimension each for $x_1$, $x_2$ and $Q^2$, one for the sub-process in question,
and one for the order in $\alpha_s$. The task is to store the weight grid in such a way that as little
memory as possible is used and the information can be extracted in a fast way. In the following
several options to reduce the necessary memory are discussed:

The simplest structure for a software implementation of the weight grid is a multidimensional
array (for $x_1$, $x_2$ and $Q^2$), like the TH3D class available in the ROOT analysis framework.

The overhead of storing empty bins can be largely reduced by calculating the $x_1$, $x_2$ and $Q^2$
boundaries of the weight grid using the NLO QCD program in a special run before the actual 
filling step. At the beginning of the filling step the adjusted boundaries of the weight grids are 
then read-in and an optimised weight grid is constructed. 

Since the rectilinear region bounded by limits in $x_1$, $x_2$ and $Q^2$ may contain many phase-space
points that are unoccupied, additional memory can be saved by using methods to avoid
storing elements in the weight grid that are not filled. Since the occupied regions are continuous,
but irregular, grid formats for truly sparse matrices (such as the Harwell-Boeing format) are
not used. Instead a custom format is favoured where the lower and upper limits of the grid in each
dimension are stored along with all the elements in between.

This is illustrated in Fig. 1 for a simple two-dimensional grid. For the three-dimensional
structure, each of the row-column elements would itself be a column with its own lower and
upper range delimiters. The resulting saving of memory is usually around a factor of four, even
after taking into account the additional storage for the range delimiters (footnote 5).

Figure 1: An example of the custom two-dimensional sparse structure. Rows and columns are
numbered from 0 to 20 from the top left. The elements with data members are shown filled; only
rows 1 to 17 have data members, and for each row the columns that have data members are shown on
the right. A total of 117 elements, from the maximum of 400 elements, are stored, along with the
single pair of row-range delimiters, and the 17 pairs of column-range delimiters for each of the
individual rows.

Since the grid itself knows the index of the first and last filled element in each row, column
etc., it is possible to only iterate over those elements of the grid that contain data. Similarly
when interrogating the grid for the value of an element, it is possible to ascertain whether the
element is in the occupied or unoccupied region of the grid and return the value of the
element if filled, or 0 otherwise. This makes accessing the unfilled members of the grid much
faster than otherwise.
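The range-delimited storage described above can be illustrated with a small two-dimensional sketch. The class names `SparseRow` and `SparseGrid2D` are hypothetical and the layout is a simplification for illustration; it is not the storage format of the actual code.

```python
class SparseRow:
    """One row: only the contiguous span of columns between the first and
    last filled column is stored, together with its lower limit."""
    def __init__(self):
        self.lo = 0        # index of the first stored column
        self.data = []     # values of columns lo .. lo + len(data) - 1

    def add(self, col, value):
        if not self.data:
            self.lo, self.data = col, [0.0]
        elif col < self.lo:                               # grow to the left
            self.data = [0.0] * (self.lo - col) + self.data
            self.lo = col
        elif col >= self.lo + len(self.data):             # grow to the right
            self.data.extend([0.0] * (col - self.lo - len(self.data) + 1))
        self.data[col - self.lo] += value

    def get(self, col):
        i = col - self.lo
        return self.data[i] if 0 <= i < len(self.data) else 0.0

class SparseGrid2D:
    """Rows are range-delimited in the same way as the columns within a row."""
    def __init__(self):
        self.row_lo = 0
        self.rows = []

    def add(self, row, col, value):
        if not self.rows:
            self.row_lo, self.rows = row, [SparseRow()]
        elif row < self.row_lo:
            self.rows = [SparseRow() for _ in range(self.row_lo - row)] + self.rows
            self.row_lo = row
        elif row >= self.row_lo + len(self.rows):
            extra = row - self.row_lo - len(self.rows) + 1
            self.rows += [SparseRow() for _ in range(extra)]
        self.rows[row - self.row_lo].add(col, value)

    def get(self, row, col):
        """Value of an element; 0 for anything outside the stored ranges."""
        i = row - self.row_lo
        return self.rows[i].get(col) if 0 <= i < len(self.rows) else 0.0

    def stored(self):
        """Number of elements actually held in memory."""
        return sum(len(r.data) for r in self.rows)
```

Because each row knows its own limits, a loop over `row.data` visits only the occupied span, and a lookup outside the span returns 0 after a single range comparison.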

The actual implementation of the grid (footnote 6) involves a number of related classes written in C++ (footnote 7).

5 If additional savings are required in the future, packing the range delimiters for each sparse one dimensional 
structure into a single integer will halve this additional overhead, but will slightly increase the access time due to 
the unpacking. 

6 The complete code including the interfaces to NLOJET++ and MCFM is available from
The grid for a given cross-section is represented by a concrete instance of a master class,
appl::grid. This class has a number of constructors that allow the cross-section it will calculate
to be defined in terms of a fixed number of regular or variable-width bins in the cross-section
observable.

For each bin in the observable, the master class has a number of instances of an internal 
class - one for each order of $\alpha_s$ - so that for a cross-section with 10 bins, with contributions at
leading order and next-to-leading order, the master class would contain a total of 20 instances 
of the internal class. 

This internal class, appl::igrid, encodes all the information required to create the
cross-section, at one particular order, for that bin. The class contains the $x$-to-$y$, $y$-to-$x$ and
$Q^2$-to-$\tau$, $\tau$-to-$Q^2$ transform pairs, and a subclass that encodes information on how to generate
the $N$ generalised internal sub-processes for the particular interaction from the basic parton
distribution functions. It also contains instances of the sparse grid class in $x_1$, $x_2$ and $Q^2$
described above, for each of the $N$ sub-processes.

When requested to perform the convolution, the master class calls the convolute method 
of the subclass for each order of the cross-section in each bin. The convolute method of the 
subclass performs the convolution over $x_1$, $x_2$ and $Q^2$ for each of the sub-processes.

For each bin in the observable, the master class takes the cross-sections from the subclasses 
for each order from each bin and adds them to arrive at the final cross section for that bin. 

The subclasses for the generalised internal sub-processes are very basic classes which encode
the number of sub-processes, i.e. seven in the case of jet production and twelve in the case of
Z-boson production, and simply take the 13 parton distribution values for each incoming hadron
at a given scale and generate the $N$ internal processes from these.
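The ownership structure described above (one internal grid per observable bin and per order, summed by the master class) can be mimicked in a few lines. The Python classes below are hypothetical stand-ins written for illustration; they are not the interface of appl::grid or appl::igrid, and the internal grid here keeps a plain list of nodes instead of an interpolated sparse table.

```python
import bisect
import math

class InternalGrid:
    """Toy stand-in for the internal class: one observable bin at one order."""
    def __init__(self, order):
        self.order = order
        self.nodes = []    # entries (x1, x2, Q2, sub-process index, weight)

    def convolute(self, subprocess_pdfs, alphas):
        # subprocess_pdfs(x1, x2, Q2) returns the list of combinations F^(l)
        return sum(w * (alphas(q2) / (2.0 * math.pi))**self.order
                   * subprocess_pdfs(x1, x2, q2)[l]
                   for (x1, x2, q2, l, w) in self.nodes)

class MasterGrid:
    """Toy stand-in for the master class: owns the internal grids of all bins."""
    def __init__(self, bin_edges, orders):
        self.bin_edges = list(bin_edges)
        self.igrids = [[InternalGrid(p) for p in orders]
                       for _ in range(len(self.bin_edges) - 1)]

    def fill(self, observable, order_index, node):
        b = bisect.bisect_right(self.bin_edges, observable) - 1
        if 0 <= b < len(self.igrids):
            self.igrids[b][order_index].nodes.append(node)

    def convolute(self, subprocess_pdfs, alphas):
        """Cross-section per bin: sum over the orders stored for that bin."""
        return [sum(ig.convolute(subprocess_pdfs, alphas) for ig in per_bin)
                for per_bin in self.igrids]
```

With 10 bins and two orders, `MasterGrid(range(0, 1100, 100), orders=(2, 3))` holds 20 internal grids, matching the count given in the text.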

When the grid is saved to a ROOT file (footnote 8), the master class encodes the complete status of the
internal grids, which transform pair and which sub-process is required etc., so that once read
from the file, everything required to calculate the cross section (e.g. sub-process definition,
CKM matrix elements etc.) is available. In this way all information needed to perform the cross-section
calculation is available from the output file with a single function call by the user and the only
additional information required is an input function for generating the PDFs and another one
for calculating $\alpha_s$. We use the HOPPET program [8] to calculate the DGLAP splitting functions
needed for the cross-section convolution when the renormalisation and factorisation scales are varied
(see eq. (10)).

All the various choices in the weight grid architecture and other information needed to 
calculate the cross-section are encoded in the output file. They are described in the following: 

• The centre-of-mass energy at which the weight grid has been produced. 

• The choice of the coordinate transform function. By default the form of eq. (1) is used.
However, any other function can be provided by the user.

• The interpolation order as given by eq. (2).

http://svn.hepforge.org/applgrid

7 A FORTRAN interface is also available so that the basic functionality can be accessed from within user FORTRAN
code. 

8 Technically, the grid is transformed to TH3D-histograms that are stored in the output ROOT file. 
• The number of grid points to be used for each dimension $x_1$, $x_2$ and $Q^2$.

• The definition of the sub-processes via a 13 x 13 matrix. 

• The CKM matrix elements or other constants needed to calculate the cross-sections. 

• The required number of points on the grid can optionally be reduced with the aid of a
reweighting factor in the filling step. This flattens out the PDF in the region where it is
steeply falling.

By default the following functional form is used for the reweighting (footnote 9):

\[
w(x) = x^{a_1} (1 - 0.99\,x)^{a_2}. \qquad (17)
\]

The parameter $a_1$ can be adjusted to flatten out the change of the PDF at low $x$ while the
parameter $a_2$ can be optimised for the high-$x$ region. The factor 0.99 prevents the weight
from being zero for $x = 1$.

Reasonable values for the parameters $a_1$ and $a_2$ have been determined by fitting the sum
of the up, down and gluon PDFs. For the CTEQ6 PDFs [16], values of $a_1 = -1.5$ to $-1.6$
and $a_2 = 3.0$ to $3.4$ have been found for the range $5 < Q < 5000$ GeV. The variation
comes from a slight dependence of the $a_1$ and $a_2$ parameters on $Q^2$. For other PDFs, the
results of the fit can be slightly different.

The user can change the parameters or provide another functional form. 
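The effect of the reweighting function can be seen in a short numerical sketch. The toy shape `toy_pdf` below is an invented, steeply falling function used only for illustration (it is not a fitted PDF), and the helper `spread` is likewise hypothetical; only `pdf_weight` follows eq. (17).

```python
def pdf_weight(x, a1=-1.5, a2=3.0):
    """Eq. (17): w(x) = x**a1 * (1 - 0.99 x)**a2."""
    return x**a1 * (1.0 - 0.99 * x)**a2

def toy_pdf(x):
    """Illustrative steeply falling shape; an assumption of this sketch."""
    return x**-1.6 * (1.0 - x)**3.2

def spread(values):
    """Ratio of the largest to the smallest value."""
    return max(values) / min(values)

xs = [10.0**(-4 + 0.25 * i) for i in range(16)]     # from 1e-4 up to about 0.56
raw = [toy_pdf(x) for x in xs]                      # varies over many orders of magnitude
flat = [toy_pdf(x) / pdf_weight(x) for x in xs]     # nearly flat after dividing by w(x)
```

One way such a factor can enter, assumed here for illustration, is to multiply each event weight by $w(x_m)$ when filling and to divide the PDF by $w(x)$ at the grid nodes in the convolution, so that the interpolation acts on the much flatter ratio.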



4 Accuracy of the weight grids 

The choice of the weight grid architecture depends on the required accuracy, on the exact cross- 
section definition and on the available computer resources. For each possible application the 
weight grid architecture has to be carefully chosen in order to achieve the required accuracy 
with the available computer memory and computing time. For instance, for observables where 
the PDFs are steeply falling, e.g. the inclusive jet cross-section at high transverse momentum 
in the forward region, a fine grid in x is needed. The memory usage of weight grids for one 
cross-section should be kept small since, e.g. in global PDF fits, it might be necessary to read in 
a large number of weight grids. In addition, the convolution time depends on the number of grid 
nodes, and so keeping memory requirements as small as possible is in any case desirable. The 
number of points needed in the weight grid is kept modest by using the higher-order interpolation
functions of eq. (2) and optionally also by introducing a PDF weight, as in eq. (17), during the filling
step, or by using a sparse structure.

In the following, the influence of the grid architecture on the achievable accuracy in the 
cross-section calculation is discussed. The computer memory use and execution speed are also 
investigated. The production of jets and of W- and Z-bosons at LHC are used as examples. 

In our test runs, to be independent from statistical fluctuations (which can be large, in 
particular in the NLO case), in addition to the weight grid, reference histograms are filled using 
the NLO QCD calculation without weights in the standard way. The result obtained from the 
weight grid is then compared to these reference histograms.

9 Such a PDF reweighting was first introduced in ref. [10]. 
Figure 2: Ratio of grid and standard calculations of the single inclusive jet $p_T$ spectrum for
$0 < y < 1$ (a) and for $2 < y < 3$ (b), for a variety of PDFs. The results are shown for the default
weight-grid settings, i.e. 30 bins in $x$, 10 bins in $Q^2$, a coordinate transform parameter $a = 5$
and fifth-order interpolation.
Figure 3: Ratios of grid and standard calculations of the single inclusive jet $p_T$ spectrum for
$0 < y < 1$ (a) and for $2 < y < 3$ (b), illustrating the impact of varying the number of x-bins in
the grid. All weight grids have 10 bins in $Q^2$, a coordinate transform parameter $a = 5$ and
fifth-order interpolation. The PDF set is CTEQ6mE.
4.1 Jet production at hadron colliders

The single inclusive jet cross-section as a function of the jet transverse momentum ($p_T$) is
calculated for jets in the central rapidity ($y$) region of $0 < y < 1$ and in the forward rapidity
region of $2 < y < 3$. Jets are defined via the seedless cone jet algorithm as implemented in
NLOJET++, which corresponds to the seedless algorithm of ref. [17] (or SISCone [18]), except
for small differences in the split-merge procedure which are irrelevant at this order. The cone
radius has been set to $R = 0.7$, the overlap fraction to $f = 0.5$ (footnote 10). The renormalisation and
factorisation scales are set to $Q^2 = p_{T,max}^2$, where $p_{T,max}$ is the $p_T$ of the highest-$p_T$ jet in the
required rapidity region (footnote 11).

To discuss the dependence of the weight grid performance on the grid architecture, a default 
weight grid is defined from which variations in a single parameter are studied systematically. 
The default weight grid consists of 30 bins in $x$ and 10 bins in $Q^2$. The points are distributed
according to eq. (1) with $a = 5$ and fifth-order interpolation is used. No PDF reweighting (see
eq. (17)) is used.

The ratio of the cross-section calculated with the default weight grid to the reference cross- 
section calculation is shown in Fig. [2] for the jet cross section in the central rapidity region 
(0 < y < 1) (a) and the forward rapidity region (2 < y < 3) (b). The weight grid is produced in 
a run where the CTEQ6mE PDF [16] has been used to calculate the jet cross-section. This PDF 
is used as standard in the following. To show the independence of the weight grid performance 
on the used PDF, Fig. 2 also includes more recent PDFs based on the analyses of a large variety
of data (global analysis) like CTEQ6.6 [20] and MSTW2008 [21] or only using inclusive DIS data
based on combined H1 and ZEUS data (HERAPDF01) [22]. In addition, we include a PDF that
does not use a parameterised input distribution, NNPDF [23]. Further comparisons of the jet
cross-sections calculated with these PDFs can be found in section 5.

In the central region the cross-section calculated with the weight grid reaches an accuracy 
of about 0.1% for all transverse jet momenta and all PDFs. In the forward region a similar
performance is achieved for transverse jet momenta up to 1000 GeV. For transverse jet momenta 
above that value the performance degrades to 0.6% and a variation with the PDF is observed. 

The dependence of the accuracy on the number of x-bins is illustrated in Fig. 3. If only 25
x-bins are used, the accuracy is 0.3% in the central and 0.6% in the forward rapidity region. The 
accuracy decreases towards low jet transverse momenta. More accuracy is achieved by a larger 
number of x-bins. For 30 bins the accuracy is 0.1%. For 40 x-bins the improvement is small, 
but visible. A very sensitive kinematic region is the forward region with very high transverse 
momenta. In this region at least 30 x-bins are needed to get an accuracy of 0.1%. 

Fig. 4 shows the dependence of the accuracy on the number of $Q^2$-bins. This dependence is
rather small. When a large enough number of x-bins is chosen, no change is observed for 8 to
15 bins in $Q^2$.

10 These choices are related to the fact that some of the NLOJET++ runs were performed some time in the past.
A modern cone-algorithm (in the class of those with a split-merge procedure) would be SISCone [18], and a value
of $f = 0.75$ would be recommended [19].

11 Note that beyond LO the $p_{T,max}$ will in general differ from the $p_T$ of the other jets, so when binning an
inclusive jet cross-section, the $p_T$ of a given jet may not correspond to the renormalisation scale chosen for the
event as a whole. For this reason separate grid dimensions for the jet $p_T$ and for the renormalisation scale are
used. This requirement has been efficiently circumvented in some moment-space approaches [2].
The dependence on the interpolation order (as defined in eq. (3)) is shown in Fig. 5. While
varying the default interpolation order $n = 5$ to $n = 4$ and $n = 6$ gives similar results within
0.1%, the interpolation order $n = 3$ leads to an accuracy loss of 0.5% at low transverse jet
momenta in the central region, and of 0.4-1% at low and high transverse jet momenta in the
forward region.

In conclusion, the results in Figs. 2-5 demonstrate that an accuracy of 0.1% can be reached
with a reasonable weight grid size. The most critical parameter is the number of x-bins, which 
must be large enough to accommodate strong PDF variations in certain phase space regions. 
In comparison, the dependence on the number of Q 2 bins is rather weak. The interpolation 
between the grid points is sufficiently accurate to allow the grid technique to be used and fifth 
order interpolation produces reasonable results. The achieved accuracy is probably sufficient for 
all practical applications. 

In applications where a very small weight grid is needed, one can also introduce a PDF
weight to flatten out the x-dependence of the PDFs (see eq. (17)). The PDF weight is calculated
using $a_1 = -1.5$ and $a_2 = 3$. This is illustrated in Fig. 6, where grids with a very low number of
x-bins (8, 9, 10) and eight $Q^2$ bins are used and the interpolation order is lowered to $n = 4$. Even with
the smallest weight grid an accuracy of 1% is achieved using the PDF weight. For a somewhat
larger weight grid with 10 x-bins the accuracy is 0.5% in all phase space regions.

One of the important theoretical uncertainties in NLO QCD calculations is the variation
of the results with the choice of the factorisation and renormalisation scale. Eq. [15] allows the
calculation of the cross-section for any scale choice a posteriori from one weight grid produced
at a fixed scale choice. The results from scale variations by a factor of 2 up and down are
shown in the accompanying figure. The renormalisation and factorisation scales are either varied
together or varied individually. The weight grid result has been calculated with a single weight grid and the
reference cross-sections have been calculated by repeating the standard NLO QCD calculation
for each of the scale variations. The cross-section calculated with the weight grid reproduces
the standard results to within about 0.1% in the central region and 0.1-0.2% in the forward
region.

4.2 Reweighting jet cross-sections to a different centre-of-mass energy 

As outlined in section [2.6], a weight grid produced at a given centre-of-mass energy can also be
used to calculate the cross-section at a lower or higher centre-of-mass energy. This procedure
works if the coverage in x in the weight grid is large enough. For instance, when lowering the
centre-of-mass energy to calculate the jet cross-section at a fixed transverse jet momentum, it
might happen that the required large x values are not present in the weight grid produced at
a higher centre-of-mass energy. The variation of the centre-of-mass energy therefore has to be
done with care by the user.

As an example, the accuracy of the jet cross-section calculation using the default weight grid
at a fixed centre-of-mass energy of √s = 14000 GeV is investigated. Reference cross-sections
are calculated at various centre-of-mass energies, i.e. 1800, 5000, 7000, 10000, 14000, 16000
and 18000 GeV. Since the calculations at the various centre-of-mass energies are statistically
independent, each reference cross-section as well as the default weight grid at √s = 14000 GeV
needs to be calculated with large event samples. Each of the calculations is done with 50 000 000
events produced with NLOJET++.

Figure 8: Ratios of grid and standard calculations of the single inclusive jet dσ/dx_T spectrum,
with x_T = 2p_T/√s, for various centre-of-mass energies. The standard calculation has been
performed separately for each centre-of-mass energy, while the grid results are all based on a
common √s = 14000 GeV grid. The PDF set is CTEQ6mE. In a) the default grid parameters
are used (30 bins in x and 10 bins in Q^2). The last two points for √s = 1800 GeV are drawn at
1.3 for better visibility, but their true values are very large. In b) a larger grid with 50 bins in
x and 30 bins in Q^2 is used.

In order to make the comparisons more meaningful the jet transverse momentum p_T is
transformed to x_T = 2p_T/√s. For central jets the variable x_T gives approximately the momentum
fraction of the incoming parton with respect to the proton. Fig. [8a) shows the ratio of the
cross-section calculated with the standard weight grid produced at √s = 14000 GeV to the
cross-section calculated at various centre-of-mass energies in the standard way as a function of
x_T. For most points, the calculations agree within 2%. The observed fluctuations are statistical.
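
As a small worked example of this change of variable, the Python sketch below converts between p_T and x_T = 2p_T/√s and lists the jet p_T that corresponds to a fixed x_T at each of the centre-of-mass energies considered here. The function names are hypothetical and the snippet is purely illustrative arithmetic.

```python
def x_t(p_t_gev, sqrt_s_gev):
    """Scaled transverse momentum x_T = 2 p_T / sqrt(s)."""
    return 2.0 * p_t_gev / sqrt_s_gev

def p_t_from_x_t(x_t_value, sqrt_s_gev):
    """Inverse relation: p_T = x_T * sqrt(s) / 2."""
    return 0.5 * x_t_value * sqrt_s_gev

# Centre-of-mass energies (GeV) used for the reference calculations.
energies = [1800, 5000, 7000, 10000, 14000, 16000, 18000]

for sqrt_s in energies:
    # The same x_T probes very different jet p_T at different energies.
    print(sqrt_s, p_t_from_x_t(0.6, sqrt_s))
```

For instance, x_T = 0.6 corresponds to p_T of about 540 GeV at √s = 1800 GeV but to about 4200 GeV at √s = 14000 GeV, which is why a grid filled at 14000 GeV is sparsely populated in the region needed for the lowest energy.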

For large changes in centre-of-mass energy and large x_T values the approximation of the
standard grid becomes inaccurate. For instance, for √s = 1800 GeV and x_T = 0.6 the weight
grid calculation gives a result that is 10% higher than the standard calculation. This discrepancy
increases further for large x_T values. The ratio for the last two x_T values becomes very large (see footnote 12).

Fig. [8b) shows the result for a larger grid using 50 bins in x and 20 bins in Q^2. With such a
grid the deviations are mostly reduced to statistical fluctuations. Only the largest x_T value for
the lowest centre-of-mass energy exhibits a deviation of 30% from the standard calculation.

A small grid with a PDF weighting leads to large discrepancies with respect to the standard
calculation and cannot be used.

In conclusion, the grid technique gives good accuracy when computing the jet cross-section at
various centre-of-mass energies. For very high transverse momenta and extreme centre-of-mass
variations a large grid might be required.

12 In Fig. [8a) they are drawn at 1.3 for better visibility of the rest of the points.

Figure 10: Ratios of grid and standard calculations of the positron p_T spectrum in W^+-boson
production, for |η_e+| < 0.5 (a) and for |η_e+| > 3 (b). Results are shown for four grids, each with
a different interpolation order. All grids have 25 bins in x and a coordinate transform parameter
a = 5. The PDF set is CTEQ6mE.



4.3 W-boson production at hadron colliders 

To further demonstrate the performance of the weight grid method, the production of W-bosons
at LHC energies is taken as an example. The observable that will be examined is the
transverse-momentum distribution of the positron from W^+-boson decays, when the positron is either
central, |η| < 0.5, or very forward, |η| > 3.0.

As in the previous section, a default weight grid is defined and variations in a few parameters
are studied. The default weight grid consists of 25 bins in x. The points are distributed according
to eq. [1] with a = 5 and a fifth order interpolation is used. No PDF weight (see eq. [17]) is used.
The cross-sections are calculated with the factorisation and renormalisation scale fixed to the
mass of the W-boson. Therefore, the weight grid need only be two dimensional.

The influence of the number of bins in x is shown in the corresponding figure. If the number of
bins in x is too small (N_bins = 20) the cross-section is reproduced to about 0.5% in the central
region and 0.2% in the forward region.

Figure 11: Ratios of grid and standard calculations of the positron p_T spectrum in W^+-boson
production, for |η_e+| < 0.5 (a) and for |η_e+| > 3 (b). Results are shown for four grids, each
with a different coordinate transform parameter, a. All grids have 25 bins in x and fifth order
interpolation. The PDF set is CTEQ6mE.

[Plot for Figure 12: panels a) |η| < 0.5 and b) |η| > 3.0; legend entries for grids with 8, 10 and 12
x-bins, each with Int.: (5, 5, 0), a = 1 and W_PDF = 1; horizontal axis: positron p_T (GeV), up to 500;
vertical axis: ratio of grid to standard calculation, from 0.996 to 1.003.]

Figure 12: Ratios of grid and standard calculations of the positron p_T spectrum in W^+-boson
production, for |η_e+| < 0.5 (a) and for |η_e+| > 3 (b). Results are shown for grids with a reduced
number of x bins and PDF reweighting. All the grids use second order interpolation and a
coordinate transform parameter a = 1. The PDF set is CTEQ6mE.

For the default weight grid, lowering the interpolation order from n = 5 to n = 4 results in
an accuracy loss of about 0.2% over much of the p_T range, as shown in Fig. [10]. The accuracy
for positrons with low transverse momenta degrades to 0.8%. The good precision for n = 5 can
only be improved using n = 7.

Fig. [11] shows the dependence on the grid spacing parameter corresponding to the parameter
a in eq. [1]. The x-values in the cross-section calculation are not large and consequently a fine
spacing at large x (corresponding to a large a parameter) is not needed and the result improves
for low a values. An accuracy of better than 0.1% is achieved for all variations.

Finally, Fig. [12] shows that, if a PDF weighting is used, it is possible to use very small grid
sizes. For a weight grid with only eight x-bins an accuracy of 0.1% can be achieved. In this case
the gain in accuracy is small when increasing the number of x-bins. Only in the forward region
and for high transverse energies is the increase in the number of x-bins beneficial.

In summary, a sufficient accuracy is achieved with about 25 x-bins and a fifth order
interpolation. An equidistant grid spacing (a = 0) is sufficient.



4.4 CPU and computer memory performance 

The execution time for each call to the filling routine for the grid has been studied on a 1.5 GHz
PowerPC and a 3 GHz Intel Xeon running Linux, using a dummy structure with N points in
each dimension. Fig. [13] shows the performance for various grid architectures. The grids are
based on either the ROOT TH3D class, the custom sparse class (SparseMatrix3d) described in
section [3], or the TMatrixDSparse class which implements the 2-dimensional Harwell-Boeing
matrix representation. In the latter case, a sparse 1-dimensional structure of TMatrixDSparse
matrices using the classes of the SparseMatrix3d has been used to create a sparse 3-dimensional
structure. As expected the Harwell-Boeing based class is very quick for filling when the grid is
small, but as the grid size becomes larger, since the occupation is reasonably large, the number
of entries that must be examined becomes large and the filling time increases rapidly. For the
TH3D and custom sparse structures, the filling time is largely independent of the grid size.

Figure 13: The time per call for filling grid classes based on various grid architectures on a
1.5 GHz PowerPC (left) and 3 GHz Linux PC (right).

The reduction in memory occupied by the custom sparse grid structure after trimming away
unoccupied elements is illustrated in Fig. [14]. The bottom-left plot shows the absolute size of
the stored elements in MBytes, both before and after trimming away unfilled elements. The
top-left plot shows the fraction of the total, untrimmed grid size occupied by the filled elements.
As the grid spacing decreases, the overall grid size naturally increases.
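
The idea of trimming can be sketched in a few lines of Python. This is only an assumed, simplified picture (keep the bounding box of the filled elements and remember its offsets); it is not the SparseMatrix3d implementation, and the function name `trim`, the grid shape and the filled region are hypothetical.

```python
import numpy as np

def trim(block):
    """Return (offsets, trimmed_block), keeping only the bounding box of
    the non-zero entries of `block`."""
    filled = np.argwhere(block != 0.0)
    if filled.size == 0:
        # Nothing filled: store an empty block.
        return (0,) * block.ndim, np.zeros((0,) * block.ndim)
    lo = filled.min(axis=0)
    hi = filled.max(axis=0) + 1
    slices = tuple(slice(a, b) for a, b in zip(lo, hi))
    return tuple(int(a) for a in lo), block[slices].copy()

# Hypothetical grid with dimensions (x1, x2, Q^2) and a small filled region.
grid = np.zeros((30, 30, 10))
grid[5:12, 8:20, 2:6] = 1.0

offsets, trimmed = trim(grid)
fraction = trimmed.size / grid.size   # stored fraction after trimming
print(offsets, trimmed.shape, fraction)
```

The offsets allow the trimmed block to be placed back at its original position, so the convolution only has to loop over the elements that were actually stored.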

The execution time using the grid to perform the final cross-section calculation including
the PDF convolution has also been studied using a 1.5 GHz PowerPC. The results are based
on calculations of differential cross-sections with respect to the positron pseudo-rapidity and
transverse momentum in W-boson production using MCFM [14, 15], as presented in section [4.3].
The cross sections involve 20 and 24 bins for the lepton pseudo-rapidity and transverse energy
distributions respectively. Fig. [14b) shows the convolution time for grids with N bins in
dimensions x_1 and x_2 for the sparse structure. Results are given for the trimmed and untrimmed
structures. In the case of the untrimmed grid, all data elements are retained in the convolution,
even those with no entries.

Figure 14: a) Memory used for the default grid architecture using a custom sparse grid
(untrimmed) and after removing the unoccupied elements. The top figure shows the ratio of
the reduced to the full case. b) Time needed to calculate the cross-section by convoluting the
coefficients on the grid with PDFs and α_s. The convolution times are measured on a 1.5 GHz
PowerPC for a default grid. The memory and the CPU time performance is evaluated for the
W-boson cross-sections as a function of the electron rapidity and transverse momentum.

Excluding the unfilled data elements in the convolution improves the convolution time by
a factor approaching two. In addition, we see that the convolution time varies approximately
linearly with the grid linear dimension. This is because the most costly part of the convolution is
the calculation of the PDF at the grid nodes. With independent grid nodes for x_1 and x_2, there
are 2N evaluations of the PDF for each observable bin, and so the convolution scales linearly
with N.
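
The linear scaling can be illustrated with a short Python sketch in which the PDF is evaluated once per grid node and the result is then contracted with the weights. The toy PDF shape, the node positions and the function names are assumptions made for illustration; the Q^2 dimension, the sub-process sum and α_s are omitted for brevity.

```python
import numpy as np

counter = {"pdf_calls": 0}

def toy_pdf(x):
    """Hypothetical PDF shape, counting how often it is evaluated."""
    counter["pdf_calls"] += 1
    return x ** (-0.5) * (1.0 - x) ** 3

def convolute(weights, x1_nodes, x2_nodes, pdf):
    """sigma = sum_ij w_ij f(x1_i) f(x2_j), with one PDF call per node."""
    f1 = np.array([pdf(x) for x in x1_nodes])   # N PDF evaluations
    f2 = np.array([pdf(x) for x in x2_nodes])   # N PDF evaluations
    return f1 @ weights @ f2                    # cheap N x N contraction

N = 25
x1 = np.logspace(-4, -0.01, N)   # hypothetical node positions in x_1
x2 = np.logspace(-4, -0.01, N)   # hypothetical node positions in x_2
w = np.ones((N, N))              # placeholder weights for one observable bin

sigma = convolute(w, x1, x2, toy_pdf)
print(counter["pdf_calls"])      # number of PDF evaluations used
```

In this sketch the expensive step (the PDF calls) grows as 2N, while the N × N contraction is a simple sum of products, which is consistent with the approximately linear behaviour described above.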

In conclusion, the custom sparse structure using trimmed blocks gives the best performance. 

5 Application example: Calculation of NLO QCD uncertainty 
for inclusive jet cross-sections for proton proton collisions at 
various centre-of-mass energies 

As an example, in this section the uncertainties of the inclusive jet cross-section in the central
region (0 < y < 1) are evaluated from the default grid obtained at a centre-of-mass energy
of √s = 14000 GeV. The jet cross-sections are calculated at various centre-of-mass energies.
The most recent PDF parameterisations along with their associated uncertainties are used, i.e.
CTEQ6.6 [20], MSTW2008 [21], HERAPDF01 [22] and NNPDF [23].

Fig. [15] shows the effect of the PDF uncertainty from CTEQ6.6 (a), MSTW2008 (b),
HERAPDF01 (c) and NNPDF (d) on the inclusive jet cross-section with respect to the central value
of the somewhat older PDF, CTEQ6mE [16]. The uncertainty from the CTEQ6mE PDF is also
overlaid. The band illustrates the result of adding the jet cross-sections obtained for each of the
PDF variations (see footnote 13). The marker indicates the central value. Fig. [16] shows the PDF
uncertainty together with the renormalisation and factorisation scale uncertainty added in quadrature with
respect to the central value for each of the PDFs.
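
As an illustration of how such a band can be built from cross-sections evaluated for each PDF variation, the Python sketch below uses the commonly quoted asymmetric Hessian prescription, which is assumed here to be what the formulae cited for the Hessian-type PDF sets amount to, and then adds a scale uncertainty in quadrature. All function names and numbers are hypothetical.

```python
import math

def asymmetric_hessian_uncertainty(central, plus, minus):
    """Upward and downward uncertainty from pairs of eigenvector variations.

    plus[k] and minus[k] are the cross-sections obtained with the two
    variations along eigenvector direction k.
    """
    up2 = down2 = 0.0
    for f_plus, f_minus in zip(plus, minus):
        up2 += max(f_plus - central, f_minus - central, 0.0) ** 2
        down2 += max(central - f_plus, central - f_minus, 0.0) ** 2
    return math.sqrt(up2), math.sqrt(down2)

def add_in_quadrature(*terms):
    """Combine independent uncertainties in quadrature."""
    return math.sqrt(sum(t * t for t in terms))

# Hypothetical cross-sections (arbitrary units) for three eigenvector pairs.
central = 100.0
plus = [103.0, 99.0, 101.0]
minus = [98.0, 104.0, 100.5]

pdf_up, pdf_down = asymmetric_hessian_uncertainty(central, plus, minus)
scale_up = 4.0   # hypothetical upward scale uncertainty
total_up = add_in_quadrature(pdf_up, scale_up)
print(pdf_up, pdf_down, total_up)
```

Because the grid makes each of these cross-section evaluations a fast convolution, the loop over PDF variations does not require repeating the NLO calculation.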

For p_T < 1000 GeV the jet cross-section obtained with CTEQ6.6 is about 2% smaller than
from CTEQ6mE. Above this value the CTEQ6.6 cross-section increases with respect to the one
from CTEQ6mE as the jet p_T increases. At p_T = 2000 GeV it is about equal and at p_T = 4000
GeV it is about 10% larger. The uncertainty is reduced for the CTEQ6.6 PDF. The uncertainty
is about 3% for p_T < 500 GeV, about 8% at p_T = 1000 GeV and about 20% at p_T = 3000
GeV.

The MSTW2008 PDF gives a jet cross-section that is 5% larger than the one obtained with
CTEQ6mE at p_T < 500 GeV and is about the same at p_T = 1000 GeV and then further
decreases. The uncertainty is only about 2% for p_T < 500 GeV and then increases to about
6% at p_T = 1000 GeV. The MSTW2008 PDF gives a smaller uncertainty than the CTEQ6.6
PDF. It seems that the differences between the jet cross-section calculated with CTEQ6.6 and
MSTW2008 are a bit larger than the individual uncertainties.

13 The uncertainty band is obtained using eq. 51 and eq. 52 in ref. [21] for the HERA, MSTW and the CTEQ
PDFs. This formula has also been suggested earlier in ref. [24]. For the NNPDF eq. 164 in ref. [23] is used. The
uncertainty in the NNPDF corresponds to the standard deviation of all variations, while in the case of the other
PDFs it corresponds to the 90% confidence limit. For better comparison, the uncertainties of the NNPDF and
the HERAPDF01 have been scaled up using eq. 165 in ref. [23].

Figure 15: PDF uncertainty of the inclusive jet cross-section for jets within 0 < y < 1 as a
function of the transverse jet momentum p_T for a centre-of-mass energy of √s = 14000 GeV.
Shown is the jet cross-section uncertainty induced by the CTEQ6mE PDF and the CTEQ6.6 (a),
the MSTW2008 (b), the HERAPDF (c) and the NNPDF PDF (d). The reference cross-section
σ_0 is the one obtained by the central value of the CTEQ6mE PDF. The default PDF is indicated
by a marker.

Figure 16: Uncertainty of the inclusive jet cross-section for jets within 0 < y < 1 as a function
of the transverse jet momentum p_T at fixed centre-of-mass energy √s = 14000 GeV. Shown is
the ratio of the cross-section with varied PDFs and renormalisation and factorisation scales (σ)
to the cross-section calculated with the central value of each PDF set and no scale variation, i.e.
μ_R = μ_F = 1 (σ_0). The inner uncertainty band shows only the PDF uncertainty. The outer band
shows the PDF and the scale uncertainty added in quadrature. The uncertainty of CTEQ6.6 is
shown in a), of MSTW2008 in b), of HERAPDF in c) and of NNPDF in d).

The result obtained with the HERAPDF01 is more similar to the one obtained from MSTW2008
than the one from CTEQ6.6. At low p_T the central value is about 2% higher than the one from
CTEQ6mE. In the region 500 < p_T < 1500 GeV the HERAPDF01 predicts a lower jet
cross-section than the other PDFs. The uncertainty is about 5% for p_T < 1000 GeV and then increases
to about 20-40% at p_T = 3000 GeV. The small uncertainty of the jet cross-section
calculated with the HERAPDF01 is remarkable, since only DIS data are used. However, model and
parametrization uncertainties are not included in this cross-section calculation. The MSTW and
CTEQ sets do not yet include the most recent HERA data. The NNPDF predicts jet
cross-sections that are 5-10% higher than the ones from the other PDFs, in particular in the region
300 < p_T < 1000 GeV. The uncertainty is about 5% at low p_T, 10% at 1000 GeV and 20-30%
at 3000 GeV.

The overall uncertainty, i.e. including the PDF and the scale variation added in quadrature,
is shown in Fig. [16]. It is about 8% up to a p_T of about 1000 GeV and then increases towards
higher p_T. It is about 20-30% at p_T = 3000 GeV. For very high p_T the PDF uncertainty
dominates.

Figure 17: a) Inclusive jet cross-section for jets within 0 < y < 1 and with transverse jet
momenta p_T > 100 GeV, p_T > 300 GeV and p_T > 500 GeV as a function of the centre-of-mass
energy √s. b) Shows the same as a), but all results are normalised to √s = 5000 GeV. The
markers indicate the results calculated at each centre-of-mass energy in the standard way. The
lines indicate the results deduced from the default weight grid produced for a centre-of-mass
energy of √s = 14000 GeV.



Fig. [17a) shows the total inclusive cross-section for central jets (0 < y < 1) integrated for
p_T > 100 GeV, p_T > 300 GeV and p_T > 500 GeV as a function of the centre-of-mass energy. The
markers denote the reference cross-section calculated in the standard way. The lines are obtained
from a weight grid produced at √s = 14000 GeV. The cross-section calculation from the default
weight grid reproduces the reference cross-sections within 1-2% (see also section [4.2]). For
each jet transverse momentum threshold the total jet cross-section rises with increasing
centre-of-mass energy. Fig. [17b) shows the centre-of-mass energy dependence of the jet cross-section
normalised to the jet cross-section at 5000 GeV for each jet transverse momentum threshold.
As expected the centre-of-mass energy dependence is strongest for high transverse jet momenta.

Figure 18: Uncertainty of the inclusive jet cross-section for jets with transverse momenta p_T >
100 GeV and within 0 < y < 1 as a function of the centre-of-mass energy √s. Shown is the
ratio of the cross-section with varied PDFs and renormalisation and factorisation scale (σ) to
the cross-section calculated with the central value of each PDF set and no scale variation (σ_0).
The inner uncertainty band shows only the PDF uncertainty. The outer band shows the PDF
and the scale uncertainty added in quadrature. The uncertainty from CTEQ6.6 is shown in a),
from MSTW2008 in b), from HERAPDF in c) and from NNPDF in d).

Fig. [18] shows for each of the considered PDF sets the PDF uncertainty along with the
renormalisation and factorisation scale uncertainty added in quadrature for jets with p_T > 100 GeV
as a function of the centre-of-mass energy √s. Both the PDF and the scale uncertainties depend
only weakly on the centre-of-mass energy. For high centre-of-mass energies the uncertainties
are a bit smaller.

6 Application example: PDF fit including DIS data and jet 
production data at hadron colliders 

An important application of the method outlined above is the consistent inclusion of final
state measurements from hadronic colliders into the final extraction of PDFs by NLO QCD
fits. Measurements of final states - such as jet production or the production of lepton pairs
via the Drell-Yan process - can provide important additional constraints on the proton PDFs,
complementary to those from inclusive DIS data.

As a simple "proof-of-principle" example, the grid technique outlined in this paper has been
used to include simulated LHC jet data into an NLO QCD fit. The fit framework used here is
based on the recent ZEUS-JETS PDF, derived from a fit to inclusive DIS and jet data from
HERA. Jet cross-sections from the TEVATRON or any other data than those from HERA are
not used. Full details of the data-sets, PDF parameterisation and other assumptions are given
elsewhere [5].

To represent the LHC data for inclusion in the fit, jet production from proton-proton
collisions at a centre-of-mass energy of 14000 GeV was simulated using the JETRAD [25] program,
using the CTEQ6.1 PDF [26]. Single inclusive jet cross sections, differential in p_T, were obtained
in three regions of rapidity: 0 < |y| < 1, 1 < |y| < 2 and 2 < |y| < 3. A grid with default
parameters, as described in Sec. [4.1], was produced and interfaced to the ZEUS NLO QCD fit
program. Several fits were performed, using different assumptions on the statistical and
systematic uncertainties on the simulated data. The PDF uncertainties were calculated using the
Hessian method [27,28], with Δχ^2 = 1 (see footnote 14).
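
For orientation, the standard linear error propagation used in Hessian-type analyses can be sketched as follows. This is a generic illustration with hypothetical inputs, not the ZEUS fit program: the Hessian is taken here as H_ij = (1/2) ∂^2χ^2/∂a_i ∂a_j at the minimum (an assumed convention), and the uncertainty on a quantity F is then given by (ΔF)^2 = Δχ^2 Σ_ij (∂F/∂a_i) (H^-1)_ij (∂F/∂a_j).

```python
import numpy as np

def hessian_uncertainty(grad_f, hessian, delta_chi2=1.0):
    """Uncertainty on F from linear error propagation.

    grad_f  : derivatives dF/da_i at the chi^2 minimum
    hessian : H_ij = 0.5 * d^2 chi^2 / (da_i da_j) at the minimum
    """
    covariance = np.linalg.inv(hessian)
    return float(np.sqrt(delta_chi2 * grad_f @ covariance @ grad_f))

# Hypothetical two-parameter example.
H = np.array([[4.0, 0.0],
              [0.0, 25.0]])
grad = np.array([1.0, 2.0])

delta_f = hessian_uncertainty(grad, H, delta_chi2=1.0)
print(delta_f)
```

With the grid technique, the jet cross-sections entering χ^2 can be recomputed for each trial set of PDF parameters by a fast convolution, which is what makes their inclusion in such a fit practical.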

A representative result is shown in Fig. [19]. In this example, the statistical uncertainty on
the simulated LHC jet data corresponds to an integrated luminosity of 10 fb^-1 and uncorrelated
systematic uncertainties have been assumed to be at a level of 5%. A precise jet energy scale
uncertainty of 1% (corresponding to ~5-15% on the generated cross-sections) has also been
assumed, and is included as a correlated systematic in the fit. Fig. [19] a) shows the up-valence,
down-valence, total sea and gluon PDF distributions as a function of x, at Q^2 = 10000 GeV^2.
The shaded band shows the results of the fit including the simulated LHC jet data. In Fig. [19]
b), the fractional uncertainties on the gluon PDF, at a number of Q^2 values, are shown.

14 Note that using Δχ^2 = 1 in the Hessian method is generally considered to underestimate the PDF
uncertainties. However, the main aim of this study is to provide a proof-of-principle example of the use of the
grids discussed in this paper, and not to provide quantitative estimates of the expected PDF uncertainties.
Furthermore, all fits shown in this section have used the same definition of the PDF uncertainties, such that any
comparison should still be valid.

Comparison with the results from a fit which does not include the simulated LHC jet data
indicates that some constraint on the high-x gluon could be provided by the LHC single inclusive
jet data (see footnote 15). However, this is reliant on a very precise knowledge of the jet energy scale.

In fact, according to this study, a precise knowledge of the jet energy scale is the key factor.
Other fits, which assumed a smaller integrated luminosity (1 fb^-1) or larger uncorrelated
systematics (10%), still indicated an improvement on the gluon uncertainties, provided the jet energy
scale uncertainty was kept at a level of ~1%. However, fits in which the latter uncertainty was
assumed to be larger indicated little or no improvement in the gluon uncertainty compared to
the reference. More details can be found in ref. [31].




Figure 19: The distributions of the up-valence, down-valence, total sea and gluon PDFs (a), and
the fractional uncertainty on the gluon distribution at a number of Q^2 values (b) as a function
of the parton momentum fraction x. The results from the fit using weight grids to include
simulated LHC jet data are shown by the shaded band. For comparison, in (b), the results of
the ZEUS NLO QCD fit are also shown, indicated by the hatched band. The simulated LHC
jet data included in the new fit assume a statistical uncertainty corresponding to an integrated
luminosity of 10 fb^-1, uncorrelated systematics of 5% and a jet energy scale uncertainty of 1%.

Such precision on the jet energy scale is achievable, but will require a lot of experimental work
on the understanding of the LHC detectors. The inclusion of TEVATRON jet cross-sections in
the NLO QCD fit might provide further constraints. However, it may be the case that ratios of
jet cross sections - for example, in different rapidity regions - may have substantially smaller
systematic uncertainties, while retaining sensitivity to the gluon density in the proton. Further
constraints on the proton PDFs are also expected from Drell-Yan data measured at the LHC or any
other data than those from HERA. Such data sets can now be consistently included in NLO
QCD fits.

15 Note that the fit without the simulated LHC jet data is not identical to the ZEUS-JETS fit since the standard
ZEUS fit [5] uses the Offset method to determine the PDF uncertainties. The ZEUS fit shown here is a modified
version of the published analysis, with uncertainties determined using the Hessian method, with Δχ^2 = 1.
Different treatments of experimental uncertainties in PDF analyses are discussed extensively elsewhere [27-30].

Conclusions



A technique has been developed to store the perturbative coefficients calculated by an NLO QCD
Monte Carlo program in a look-up table (grid) allowing for a posteriori inclusion of an arbitrary
parton density function (PDF) set and of alternative values of the strong coupling constant as
well as for a posteriori variation of the renormalisation and factorisation scale. This extends a
technique that has already been successfully used to analyse HERA data to the more demanding
case of proton-proton collisions at LHC energies.

The technique can be used to constrain PDF uncertainties by allowing the consistent
inclusion of final state observables in global QCD PDF fit analyses. This will help to increase the
sensitivity of the LHC to new physics appearing as deviations from the Standard Model predictions.

An accuracy of better than 0.1% can be reached with reasonably small look-up tables for the
single inclusive jet cross-section in the central rapidity region |y| < 1, for jet transverse momenta
(p_T) from 100 to 4500 GeV, and about 0.2% for jets in the forward rapidity region 2 < |y| < 3.
Similar accuracy can be achieved for the differential cross-sections in rapidity and transverse
momentum of electrons produced in Z and W-boson decays. This was examined in the central
|y| < 0.5 and very forward |y| > 3 regions for transverse momenta p_T < 500 GeV.

The look-up tables provide a powerful tool to quickly evaluate the PDF and scale
uncertainties of the cross-section at various centre-of-mass energies. The most recent PDFs predict jet
cross-sections in the central rapidity region within a few percent accuracy over a large range of
jet transverse momenta.

This technique has been successfully applied to a PDF fit using inclusive deep-inelastic 
scattering and jet data measured at the electron-proton collider HERA and using simulated 
LHC jet cross-sections. An improvement on the uncertainty of the gluon density can only be 
achieved if the jet energy scale is very precisely known. A more comprehensive analysis will 
be possible in the future, since the grid technique can be applied to most of the available NLO 
QCD calculations. 

Appendix A: sub-processes for W- and Z-boson production



The production of W- and Z-bosons in proton-proton collisions involves flavour-dependent
electro-weak couplings. Therefore, the number of sub-processes that need to be defined is larger
than in the case of jet production. To reduce the number of sub-processes as much as
possible, quarks are assumed to be massless and the CKM matrix elements [32, 33] are used to
describe the contributions of the various quark flavours.

In the case of Z-boson production 12 combinations of initial state partons need to be
distinguished:

where g denotes gluons and U(D) denotes up (down)-type quarks. Use is made of the generalised
PDFs defined as:

(19)

where f_{i/H} is the PDF of flavour i = -6 ... 6 for hadron H and H_1 (H_2) denotes the first or
second hadron.

In the case of W^+-boson production (see footnote 16) 6 initial state combinations are needed:

where the generalised PDFs are used. They are defined as:

\[
\begin{aligned}
G_H(x) &= f_{0/H}(x, Q^2) , \\
U_H(x) &= f_{2/H}(x, Q^2)\,(V_{ud}^2 + V_{us}^2) + f_{4/H}(x, Q^2)\,(V_{cd}^2 + V_{cs}^2) , \\
D_H(x) &= f_{-1/H}(x, Q^2)\,(V_{ud}^2 + V_{cd}^2) + f_{-3/H}(x, Q^2)\,(V_{us}^2 + V_{cs}^2) , \\
S_{12}(x_1, x_2) &= f_{-3/H_1}(x_1, Q^2)\, f_{2/H_2}(x_2, Q^2)\, V_{us}^2
                  + f_{-3/H_1}(x_1, Q^2)\, f_{4/H_2}(x_2, Q^2)\, V_{cs}^2 \\
                 &\quad + f_{-1/H_1}(x_1, Q^2)\, f_{2/H_2}(x_2, Q^2)\, V_{ud}^2
                  + f_{-1/H_1}(x_1, Q^2)\, f_{4/H_2}(x_2, Q^2)\, V_{cd}^2 , \\
S_{21}(x_1, x_2) &= f_{2/H_1}(x_1, Q^2)\, f_{-3/H_2}(x_2, Q^2)\, V_{us}^2
                  + f_{4/H_1}(x_1, Q^2)\, f_{-3/H_2}(x_2, Q^2)\, V_{cs}^2 \\
                 &\quad + f_{2/H_1}(x_1, Q^2)\, f_{-1/H_2}(x_2, Q^2)\, V_{ud}^2
                  + f_{4/H_1}(x_1, Q^2)\, f_{-1/H_2}(x_2, Q^2)\, V_{cd}^2 ,
\end{aligned}
\qquad (21)
\]

where V_{ij} are the CKM matrix elements (see footnote 17).

For simplicity, the top contribution has been omitted in the equations above, since the
corresponding parton densities are zero for most practical applications.
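
A minimal Python sketch of how the generalised PDFs for W^+ production defined above could be assembled from individual flavour densities is given below. It follows the structure of the definitions as read here (only the four CKM elements V_ud, V_us, V_cd and V_cs enter); the flavour numbering (0 = gluon, 1 = d, 2 = u, 3 = s, 4 = c, negative values for antiquarks), the function name and all numerical values are assumptions made for illustration.

```python
def generalised_pdfs(f1, f2, ckm2):
    """Generalised PDFs for W+ production.

    f1, f2 : dicts mapping flavour index -> PDF value for hadron 1 at
             (x1, Q^2) and hadron 2 at (x2, Q^2)
    ckm2   : dict with the squared CKM elements "ud", "us", "cd", "cs"
    """
    def up_type(f):
        return f[2] * (ckm2["ud"] + ckm2["us"]) + f[4] * (ckm2["cd"] + ckm2["cs"])

    def anti_down_type(f):
        return f[-1] * (ckm2["ud"] + ckm2["cd"]) + f[-3] * (ckm2["us"] + ckm2["cs"])

    s12 = (f1[-3] * f2[2] * ckm2["us"] + f1[-3] * f2[4] * ckm2["cs"]
           + f1[-1] * f2[2] * ckm2["ud"] + f1[-1] * f2[4] * ckm2["cd"])
    s21 = (f1[2] * f2[-3] * ckm2["us"] + f1[4] * f2[-3] * ckm2["cs"]
           + f1[2] * f2[-1] * ckm2["ud"] + f1[4] * f2[-1] * ckm2["cd"])
    return {
        "G1": f1[0], "G2": f2[0],
        "U1": up_type(f1), "U2": up_type(f2),
        "D1": anti_down_type(f1), "D2": anti_down_type(f2),
        "S12": s12, "S21": s21,
    }

# Hypothetical inputs: identical hadrons and illustrative squared CKM elements.
flavours = {0: 1.0, 2: 2.0, 4: 0.5, -1: 0.3, -3: 0.2}
ckm_squared = {"ud": 0.95, "us": 0.05, "cd": 0.05, "cs": 0.95}

pdfs = generalised_pdfs(flavours, flavours, ckm_squared)
print(pdfs)
```

Storing the CKM values alongside the grid, as described in footnote 17, corresponds in this sketch to keeping `ckm_squared` together with the weights, so that the same values enter both the NLO calculation and the PDF combinations.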



16 The case of W^- -boson production can be treated in an analogous way.

17 The CKM matrix elements are stored together with the weight grid in the same file. This ensures that
the same values are used in the NLO calculation and in the PDF combinations. This choice can be changed a
posteriori according to the needs of the user. In MCFM, only four non-zero CKM matrix elements are used.

Appendix B: Automated identification of sub-processes



In general there are 169 (13 × 13 flavour) possible PDF combinations for proton-proton collisions.
In order to store only the minimal amount of information, one needs to establish which of
those combinations always come with correlated weights, or equivalently one should identify the
underlying physical sub-processes. So far, for each process under study, the sub-processes have
been identified manually, on a case-by-case basis. However, the sub-processes can also be found
in an automated way.

To simplify the discussion (without loss of generality) it will be convenient to assume that
the PDFs are always evaluated at fixed values of x. For each event i and for each of the 169
PDF combinations j (with PDF weight p_j), the NLO QCD program calculates matrix-element
weights W_ij. The total weight for the event i is Σ_j W_ij p_j. The PDF combinations are called
channels in the following.

To identify the sub-processes, one determines the Wij weights for 169 events, giving a 169 x 
169 matrix, whose i (event) index labels the rows and whose j (PDF channel) index labels the 
columns. One then carries out an eigenvalue decomposition of the Wij matrix. If v n denotes the 
n th eigenvalue and L n and R n the left and right eigenvectors (with components L n j, etc.), then 
as long as the there are no degenerate eigenvalues, an orthonormality relation can be written: 

Ln ' Rrn — &mm (22) 

where the normalisation is our specific choice. Then one can rewrite the Wij matrix as 

$W_{ij} = \sum_n R_{ni} \, v_n \, L_{nj}$ , (23) 

it being straightforward, for example, to verify that both sides satisfy $\sum_i L_{ni} W_{ij} = v_n L_{nj}$ 
and $\sum_j W_{ij} R_{mj} = v_m R_{mi}$. Let us now assume that only $N$ of the eigenvalues are non-zero.$^{18}$ 
Then eq. [23] can be interpreted as follows: there are $N$ relevant sub-processes; each $n \le N$ 
corresponds to a sub-process that multiplies a linear combination of PDF channels in which 
the contribution of channel $j$ is $L_{nj}$. In event $i$, sub-process $n$ comes with a weight $R_{ni} v_n$. It 
will be convenient to denote this by $W_{in}$. By virtue of the orthonormality condition eq. [22] we 
have that $W_{in} = \sum_j W_{ij} R_{nj}$, i.e. to determine the weight of sub-process $n$ (whose PDF channel 
combination is given by the left eigenvector $L_{nj}$) we take the right eigenvector that corresponds 
to this channel combination and use it to right-multiply the full weight matrix, so as to eliminate all but 
the contribution to the $n$th sub-process. 
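
The decomposition can be tried out on a toy problem. The sketch below is an illustration under stated assumptions, not the implementation used for the results in this paper: it uses 6 channels instead of 169, a made-up channel structure `B` with 3 sub-processes, random event weights, and NumPy's general eigen-solver. The left eigenvectors are obtained from eq. (23) restricted to the non-zero eigenvalues, which avoids relying on the degenerate zero eigenvalues. For a general non-symmetric matrix the eigenvalues may be complex, so the arrays may be complex-valued. 

```python
import numpy as np

rng = np.random.default_rng(1)
n_chan, n_sub = 6, 3   # toy sizes; the text uses 169 channels

# Hypothetical "true" sub-processes: row n marks the channels with correlated weights.
B = np.array([[1, 1, 0, 0, 0, 0],
              [0, 0, 1, 1, 1, 0],
              [0, 0, 0, 0, 0, 1]], dtype=float)

# Square weight matrix W_ij from the first n_chan events (rows: events, columns: channels).
W = rng.normal(size=(n_chan, n_sub)) @ B

# Right eigenvectors are the columns of V: W @ V[:, n] = v[n] * V[:, n].
v, V = np.linalg.eig(W)

# Keep only eigenvalues that are non-zero beyond rounding noise.
keep = np.abs(v) > 1e-8 * np.abs(v).max()
v_n = v[keep]
R = V[:, keep].T                      # R[n, i] = R_ni

# Left eigenvectors L[n, j], normalised so that L_n . R_m = delta_nm.
# Follows from W = sum_n R_ni v_n L_nj with only the non-zero eigenvalues kept.
L = (np.linalg.pinv(R.T) @ W) / v_n[:, None]

# Sub-process weights W_in = sum_j W_ij R_nj, which should equal R_ni v_n.
W_sub = W @ R.T                       # shape (events, sub-processes)
W_from_eq23 = R.T @ (v_n[:, None] * L)
```

In this toy setup the number of retained eigenvalues is the number of sub-processes, and `W_from_eq23` reproduces `W`. The relative threshold `1e-8` is an arbitrary choice made for this sketch. 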

The next step is to observe that the sub-processes determined for the first 169 events should 
hold for all remaining events.$^{19}$ So now for any event $i$, we can determine the weight for 
sub-process $n$, $W_{in} = \sum_j W_{ij} R_{nj}$, using the $R_{nj}$ determined from the initial events. Having stored 
the $W_{in}$, one can then subsequently reconstruct the full $W_{ij}$ as $W_{ij} = \sum_n W_{in} L_{nj}$. 
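
The store-and-reconstruct step can be sketched as follows, again on a toy problem with assumed sizes and a made-up channel structure. The helper `find_subprocesses` is a hypothetical name introduced for this illustration; it is not part of MCFM or of the grid code. 

```python
import numpy as np


def find_subprocesses(W, tol=1e-8):
    """Return (R, L) for the non-zero eigenvalues of the square matrix W.

    R[n, j] are right eigenvectors and L[n, j] the matching left eigenvectors,
    normalised so that sum_j L[n, j] * R[m, j] = delta_nm.
    """
    v, V = np.linalg.eig(W)
    keep = np.abs(v) > tol * np.abs(v).max()
    R = V[:, keep].T
    L = (np.linalg.pinv(R.T) @ W) / v[keep][:, None]
    return R, L


rng = np.random.default_rng(2)
n_chan, n_events = 6, 1000     # toy sizes; the text uses 169 channels

# Hypothetical channel structure with 3 underlying sub-processes.
B = np.array([[1, 1, 0, 0, 0, 0],
              [0, 0, 1, 1, 1, 0],
              [0, 0, 0, 0, 0, 1]], dtype=float)
W_all = rng.normal(size=(n_events, 3)) @ B    # W_ij for every event

# Step 1: identify the sub-processes from the first n_chan events only.
R, L = find_subprocesses(W_all[:n_chan])

# Step 2: for every event store only W_in = sum_j W_ij R_nj.
W_stored = W_all @ R.T          # shape (n_events, N) rather than (n_events, n_chan)

# Step 3: recover the full weights as W_ij = sum_n W_in L_nj.
W_rebuilt = W_stored @ L
```

Here only 3 numbers per event are stored instead of 6, and `W_rebuilt` agrees with `W_all` up to rounding. If the eigenvalues come out complex, the stored weights are complex as well and the imaginary part of `W_rebuilt` vanishes only up to rounding. 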

We have verified, in the context of a number of MCFM processes, that this approach is viable 



18 This contradicts the requirement that the eigenvalues be non-degenerate. In practice, the rounding errors in 
the original calculation of the $W_{ij}$ cause the nominally zero eigenvalues to be slightly non-zero, which 
alleviates the issue. 

19 This is guaranteed as long as the NLO Monte Carlo weights include all sub-processes for each of the first 169 
events. 
in practice. However, it has yet to be fully integrated with the rest of our grid code, and the 
results shown above are based on the manual sub-process decompositions explicitly spelled out 
in sections 12.4.11 and in Appendix A. 

One should be aware that while the automated sub-process decomposition yields a number 
of sub-processes that is identical to what can be found with manual decomposition, the specific 
linear combinations of PDF channels are usually different. To help understand why, one can 
take the example of jet production with the 7 sub-processes of eq. [12]. There, rather than using 
$qq$ and $q\bar{q}$ channels, one might have chosen instead to store weights for the combinations of 
$qq + q\bar{q}$ and $qq - q\bar{q}$ channels. More generally, one would have been free to base the grid on 
any 7 linearly independent combinations of the channels of eq. [12]. For the automated channel 
decomposition process, the particular independent linear combinations that emerge depend on 
the random weights of the events used to identify the channels. 
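
The freedom in the choice of basis can be checked numerically. The following sketch uses a made-up layout of 4 PDF channels and two combinations labelled "qq" and "qqbar" purely for illustration; none of the numbers come from an actual calculation. 

```python
import numpy as np

rng = np.random.default_rng(3)

# Two toy channel combinations (rows) over 4 hypothetical PDF channels.
P_manual = np.array([[1.0, 1.0, 0.0, 0.0],    # "qq"
                     [0.0, 0.0, 1.0, 1.0]])   # "qqbar"

# Alternative basis: sum and difference of the two combinations.
T = np.array([[1.0,  1.0],
              [1.0, -1.0]])
P_alt = T @ P_manual

# Per-event weights stored in the manual basis ...
w_manual = rng.normal(size=(5, 2))
# ... and the equivalent weights for the alternative basis.
w_alt = w_manual @ np.linalg.inv(T)

# Both choices give the same full channel weights W_ij.
W_from_manual = w_manual @ P_manual
W_from_alt = w_alt @ P_alt
```

The stored weights differ between the two bases, but the reconstructed channel weights, and hence any cross-section built from them, are identical. The same holds for any invertible matrix `T`. 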